Title: Nucleic acid pool and method for producing the same

INDUSTRIAL FIELD 

The present invention relates to a nucleic acid pool produced by
ligating oligonucleotides at random, to a method for producing
it, and to a genetic product obtained by expressing, as a gene,
a nucleic acid existing in the pool.

BACKGROUND ART 

One approach to protein engineering, which aims to modify
naturally occurring proteins into forms more useful to human
beings, is site-specific mutation, which has produced some
results (Japanese Patent Application Laid-open No. 5-91876).
However, this requires the clarification or identification of
the stereostructure of the targeted protein, and much labor is
needed for the analysis of the stereostructure. In addition,
even when the stereostructure has been clarified or identified,
much remains unknown about the relationship between the
structure and the function of proteins. Therefore, it is still
difficult to reliably impart an intended function to the
targeted protein.

In order to overcome these difficulties, a process comprising
random mutation and screening, and also evolutionary molecular
engineering that utilizes the evolution of organisms, have been
highlighted and are said to be extremely useful (Proc. Natl.
Acad. Sci. USA, 83, 576 (1986)). However, the current methods
are directed to the substitution of at most several amino acids.

WO 95/22625 discloses a method for forming novel genes by
dividing a plurality of genes at random and homologously
recombining them to reconstruct novel genes. However, this is a
method for forming chimera genes. The genes formed by this
method are similar to the original genes and retain the
essential base sequences of the latter.

Using such known methods, it is difficult to impart to organisms
additional functions which they could not gain in the course of
their evolution. In order to obtain genetic products, such as
proteins, whose functions are greatly different from those of
naturally occurring substances, it is believed effective to
prepare a pool of nucleic acids occupying base sequence spaces
significantly different from those existing naturally, and to
produce from them genetic products having the intended
functions.

One method for this may be to prepare a nucleic acid pool that
covers all base combinations. However, even the total number of
the base sequences that may code for a relatively small protein
with 100 amino acids (300 bp) is an enormous number, $4^{300}$
(about $10^{180}$), and it is in fact impossible to prepare a
nucleic acid pool that covers all of them.

For some kinds of proteins, attention was paid to their
sub-structures, referred to as modules, and an attempt was made
to change the order of the base sequence blocks corresponding to
the individual modules to thereby produce mutants having
different module sequences (Viva Origino, Vol. 23, No. 1 (1995)
86-87). In this attempt, however, the base sequences were merely
re-sequenced individually for the individual mutants. No one has
heretofore attempted the formation of a nucleic acid pool
covering all re-sequenced molecules and the collection, from the
pool, of genes capable of expressing products having intended
properties.

The object of the present invention is to provide a method for
efficiently obtaining base sequences that lie in sequence spaces
greatly different from those of naturally occurring base
sequences, and also to provide genetic products obtained by
expressing, as genes, the nucleic acid sequences that are
obtained in that manner and that do not exist naturally.

The sequence space of a gene comprises all full-length sequences
that can theoretically be constituted by combinations of the
four bases A, G, C and T. For example, a base sequence that
codes for a protein composed of n amino acids is constructed by
selecting any one of the four bases 3n times, which gives
$4^{3n}$ combinations. Accordingly, a protein composed of 100
amino acids corresponds to about $10^{180}$ different base
sequences, as mentioned hereinabove.
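The size of this sequence space can be checked with a short calculation. The following is a minimal sketch; the function names are illustrative and do not come from the specification.

```python
import math


def sequence_space_size(n_amino_acids: int) -> int:
    """Number of distinct base sequences of length 3n over A, G, C and T."""
    return 4 ** (3 * n_amino_acids)


def order_of_magnitude(n_amino_acids: int) -> float:
    """log10 of the sequence space size, i.e. 3n * log10(4)."""
    return 3 * n_amino_acids * math.log10(4)


# A protein of 100 amino acids is encoded by 300 bases.
print(round(order_of_magnitude(100), 1))  # 180.6
```

The exponent of about 180.6 agrees with the figure of about $10^{180}$ given in the text.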

In fact, there is no limitation on the number of amino acids
that constitute proteins. Therefore, the sequence space for
proteins extends without limit. In the course of the evolution
of organisms, only a part of this sequence space has been
examined, and there is a great probability that sequences coding
for proteins with extremely excellent functions exist in the
remaining, vast sequence space. The protein engineering studies
which have been and are being made in many laboratories and
institutes are essentially directed to the creation of novel
proteins having functions superior to those of naturally
occurring proteins, and one essential approach to this purpose
is to substitute amino acids in existing sequences, as mentioned
hereinabove. However, amino acid substitution is precisely the
means that organisms themselves have used in the course of their
evolution; it is an imitation of organisms and searches only
around the sequences that organisms have already examined. In
addition, there is a probability that the sequences thus
obtained will be those that were already weeded out in the past.

We, the present inventors, have considered that, in order to
move far away from the sequence spaces that organisms have
already examined, the purpose will be attained by carrying out
operations that organisms could not have carried out. We know
that dividing a gene into several blocks and then changing the
order of the divided blocks, if it occurred in an organism,
would kill the organism. Therefore, we have concluded that this
method is suitable for our purpose. Having thus concluded, we
have assiduously studied various matters relating to this method
and, as a result, have succeeded in finding base sequences which
are significantly apart from naturally occurring base sequence
spaces and also in forming a molecule pool that covers such base
sequences, and thus have completed the present invention.

Accordingly, the present invention provides the specific nucleic
acid pool, the method for producing it, and the genetic product
obtained by expressing, as a gene, a nucleic acid existing in
the nucleic acid pool, as set out below.
1) A nucleic acid pool comprising two or more different nucleic
acids, which is constructed by dividing all or a part of one or
more genes into 3 or more blocks followed by ligating all or a
part of these blocks into sequences that are different from the
sequence or sequences of the original, non-divided gene or
genes.

2) The nucleic acid pool according to the previous 1), which
contains 10 or more different nucleic acids.

3) The nucleic acid pool according to the previous 1) or 2),
which contains all the nucleic acids with different sequences as
constructed by re-sequencing a number, n, of said blocks (where
n is the number of the different blocks formed by the division).

4) The nucleic acid pool according to any one of the previous 1)
to 3), wherein the gene is a gene coding for a protein, and the
amino acid sequence encoded by each block is the same as the
amino acid sequence encoded by the corresponding part of the
original gene.

5) The nucleic acid pool according to any one of the previous 1)
to 4), wherein the gene is a gene coding for an enzymatic
function or a control gene for it.

6) The nucleic acid pool according to the previous 5), wherein
the gene is a gene coding for any one of proteases, lipases,
cellulases, amylases, catalases, xylanases, oxidases,
dehydrogenases, oxygenases and reductases.

7) The nucleic acid pool according to any one of the previous 1)
to 6), wherein the gene is one derived from prokaryotes.

8) The nucleic acid pool according to the previous 7), wherein
the gene is one derived from Bacillus bacteria.

9) The nucleic acid pool according to the previous 8), wherein
the gene is a protease API21 gene.

10) The nucleic acid pool according to any one of the previous
1) to 9), wherein each block is an oligonucleotide.

11) The nucleic acid pool according to the previous 10), wherein
the nucleic acid is a single-stranded polynucleotide.

12) The nucleic acid pool according to the previous 10), wherein
the nucleic acid is a double-stranded polynucleotide.

13) A method for producing a nucleic acid pool comprising two or
more different nucleic acids, which comprises dividing all or a
part of one or more genes into three or more oligonucleotide
blocks or synthesizing oligonucleotides corresponding to said
blocks, followed by ligating all or a part of these blocks into
sequences that are different from those on the genes.

14) The method for producing a nucleic acid pool according to
the previous 13), which comprises the following steps a) to c):



WO 98/05764
PCT/DK97/00316



a) a step of preparing 3 or more blocks of single-stranded
oligonucleotides having base sequences that correspond to all or
a part of one or more genes, through division of one or more
genes or through synthesis of oligonucleotide chains having said
base sequences;

b) a step of adding a ribonucleotide to the 3'-terminal of each
of the oligonucleotide chain blocks obtained in the previous
step a) that has a deoxyribonucleotide at its 3'-terminal, while
adding a phosphoryl group to the 5'-terminal of each of said
blocks that has a hydroxyl group at its 5'-terminal; and

c) a step of ligating in any desired sequence the
oligonucleotide chain blocks obtained in the previous step b),
by reacting the 3'-terminal ribonucleotide of one block with the
5'-terminal phosphoryl group of another block.

15) The method for producing a nucleic acid pool according to
the previous 14), wherein the number of the blocks to be
prepared in the step a) is 3 or more.

16) The method for producing a nucleic acid pool according to
the previous 14) or 15), wherein at least one block is left with
its 5'-terminal hydroxyl group in the step b) to thereby
selectively obtain a nucleic acid or nucleic acids having said
block at the 5'-terminal.

17) The method for producing a nucleic acid pool according to
any one of the previous 14) to 16), wherein at least one block
is left with its 3'-terminal deoxyribonucleotide in the step b)
to thereby selectively obtain a nucleic acid or nucleic acids
having said block at the 3'-terminal.

18) The method for producing a nucleic acid pool according to
any one of the previous 14) to 17), wherein the blocks are
prepared in such a manner that the amino acid sequence encoded
by each block is the same as the amino acid sequence encoded by
the corresponding part of the original gene.

19) The method for producing a nucleic acid pool according to
the previous 18), wherein blocks each having a base sequence of
from the (3p+1)th to the (3q+2)th base, as counted from the
starting point of the reading frame on a gene (where p and q are
integers to be independently determined for each block, provided
that p ≤ q), are prepared in the step a), and a ribonucleotide
that corresponds to the (3q+3)th base, as counted in the same
manner as above, or that corresponds to a base that does not
change the amino acid to be encoded at said site, is added to
each said block at its 3'-terminal.

20) The method for producing a nucleic acid pool according to
any one of the previous 14) to 19), wherein the ligation of the
step c) is conducted using an RNA ligase in the presence of
adenosine triphosphate and divalent metal ions.

21) A method for producing a double-stranded nucleic acid pool,
which comprises converting the single-stranded nucleic acids
obtained in any one of the previous 14) to 20) into
double-stranded ones through polymerase reaction.

22) A genetic product to be obtained by expressing the genetic
information that exists in the nucleic acid pool of any one of
the previous 1) to 12).

Now, the present invention is described in more detail
hereinunder.

Nucleic Acid Pool 

The "nucleic acid pool" as referred to herein means a
high-density mixture of two or more different nucleic acids.
Nucleic acids are single-stranded or double-stranded
polynucleotides. The nucleic acid pool of the present invention
can cover a specific number or more, for example 10 or more,
different nucleic acid molecules having different structures.
When the nucleic acid pool is directly used in a biochemical
operation or reaction, it is desirable that it be in such a form
that all the nucleic acid components constituting it can be
reacted. However, the form of the nucleic acid pool is not
specifically defined, and the nucleic acid pool may be either a
solution or a dry mixture.

The nucleic acid pool of the present invention is characterized
in that it is constructed by dividing all or a part of one or
more genes into 3 or more blocks followed by ligating all or a
part of these blocks into sequences that are different from the
sequence or sequences of the original, non-divided gene or
genes, and is therefore characterized in that it comprises a
plurality of different nucleic acids having base sequences that
are different from the original, non-divided base sequence or
sequences. The step of "ligating all or a part of the divided
blocks into sequences that are different from the original,
non-divided sequence or sequences" as referred to herein
includes (i) re-sequencing of the blocks in an order that is
different from the original, non-divided sequence, (ii) ligation
of a plurality of the same blocks continuously or
discontinuously, (iii) re-ligation of the blocks except at least
one block, and (iv) combinations of these (i) to (iii). The
operation of dividing a gene into plural blocks and
re-sequencing these in any desired order that is employed in the
present invention is hereinafter referred to as "shuffling".

For example, where one DNA has a sequence composed of a number,
n, of blocks, as represented by formula (1):

$$A - a_1 - a_2 - \cdots - a_n - B \qquad (1)$$

wherein the starting end A and/or the terminal end B may be
omitted,

this may be shuffled according to the invention to give a
mixture of nucleic acids represented by formula (2):

$$A - a_1' - a_2' - \cdots - a_x' - B \qquad (2)$$

wherein $a_1', a_2', \ldots, a_x'$ are blocks that are
independently selected from the group of $a_1, a_2, \ldots,
a_n$; and the total number of the blocks $a_1', a_2', \ldots,
a_x'$ may differ from the total number of the blocks $a_1, a_2,
\ldots, a_n$.

In order to make the shuffling effective, one or more genes must
be divided into 3 or more blocks. If a gene is divided into only
2 blocks, only one re-sequenced form can be obtained, so that
many different nucleic acids cannot be obtained and the
effectiveness of the nucleic acid pool of the invention is poor.
Preferably, one or more genes are divided into 5 or more blocks.
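The effect of the number of blocks can be illustrated by counting the re-sequenced orders directly. The sketch below considers only simple re-orderings in which every block is used exactly once; the function name is illustrative.

```python
from itertools import permutations


def rearranged_orders(blocks):
    """All orderings of the blocks that differ from the original order."""
    original = tuple(blocks)
    return [order for order in permutations(original) if order != original]


for n in (2, 3, 5):
    blocks = [f"a{i}" for i in range(1, n + 1)]
    # Prints the block count and the number of re-sequenced orders (n! - 1).
    print(n, len(rearranged_orders(blocks)))
```

With 2 blocks there is a single re-sequenced form, with 3 blocks there are 5, and with 5 blocks there are 119.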

The blocks (such as $a_1, a_2, \ldots, a_n$ in the
above-mentioned formula (1)), which are the units to be
shuffled, are oligonucleotides or polynucleotides composed of 2
or more nucleotides (hereinafter referred to as
"oligonucleotides"). If the blocks are too short, operations
with them become complicated. In general, therefore, each block
is preferably composed of 21 or more nucleotide units, more
preferably 45 or more nucleotide units. The upper limit of the
block length is not specifically defined, provided that the
block length is shorter than the length of one gene. If,
however, the blocks are too long, the re-sequenced nucleic acids
obtained will contain many non-mutated base sequence parts.
Therefore, in general, the block length is preferably within the
range of from 5 to 30 % of the length of a gene.
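The length preferences stated above can be expressed as a simple check. This is a sketch only: the function name, the returned fields and the example gene length of 900 nucleotides are hypothetical, while the thresholds (21 nucleotides, 5 to 30 % of the gene length) are those given in the text.

```python
def check_block_lengths(gene_length, block_lengths, min_nt=21):
    """Report, for each block, whether it meets the preferred length criteria."""
    report = []
    for length in block_lengths:
        report.append({
            "length": length,
            # Preferred minimum of 21 nucleotide units per block.
            "long_enough": length >= min_nt,
            # Preferred range of 5 to 30 % of the gene length,
            # compared with integers to avoid rounding issues.
            "in_preferred_range":
                5 * gene_length <= 100 * length <= 30 * gene_length,
        })
    return report


# Hypothetical 900-nucleotide gene with three candidate block lengths.
for row in check_block_lengths(900, [18, 45, 300]):
    print(row)
```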

The division of a gene into blocks may be effected at any site
of the gene. Division of a gene into its constitutive exons, or
into segment blocks that correspond to the domains or modules of
the protein which the gene codes for, is not excluded; however,
there is a probability that shuffling at such sites has already
been examined in the natural world in the past. In order to
obtain base sequences that have not heretofore been examined in
the natural world, it is desirable that the division of a gene
be effected inside the constitutive exons or at sites
corresponding to the inside of the domains or modules of the
protein which the gene codes for.

During the shuffling of a gene, especially during the ligation
of the divided blocks thereof, it is possible to introduce
oligonucleotide blocks which the original gene does not have,
and also to insert or delete nucleotides. Needless to say, where
the gene to be shuffled is a gene that codes for a protein, it
is desirable that the gene blocks (oligonucleotides) each have
the same reading frame before and after the division of the
gene. Namely, it is desirable that the gene blocks to be
shuffled be so designed that they are always translated to give
the corresponding amino acid sequences, irrespective of their
relative positions in the shuffled sequence. Employing such
means, it is possible to obtain proteins which have structures
different as a whole from those of natural proteins but which
partly contain amino acid sequences that have been confirmed to
be useful in the natural world. Accordingly, the probability of
obtaining useful proteins by such means is higher than with the
means of synthesizing proteins totally at random.

The re-sequencing of the divided blocks to be conducted through
the shuffling thereof in the present invention is to ligate a
desired number of the blocks, $a_1, a_2, \ldots, a_n$, while
allowing the ligation of two or more of the same blocks in
series and allowing the deletion of some blocks, as mentioned
hereinabove. It is desirable that the nucleic acid pool of the
invention obtained by the ligation covers at least all nucleic
acids each composed of nearly the same number of blocks as the
number of the divided blocks. For example, when a gene is
divided into 5 blocks, a1, a2, a3, a4 and a5, it is desirable
that the nucleic acid pool obtained covers substantially all
different combinations each comprised of these 5 blocks.

More precisely, it is desirable that the nucleic acid pool
obtained covers all simple re-sequences, such as a1-a3-a2-a4-a5
(where the order of a2 and a3 was altered) and a1-a4-a2-a3-a5
(where a2, a3 and a4 were re-sequenced), and more preferably
also covers complex re-sequences comprising a plurality of the
same blocks, such as a1-a3-a2-a1-a5, in addition to such simple
re-sequences. For a number, n, of divided blocks, the number of
the former simple re-sequences is n!, while that of the latter
complex re-sequences is $n^n$. That is, for 5 blocks, the number
of the former is 5! (5 x 4 x 3 x 2 x 1) = 120, while that of the
latter is $5^5$ = 3125. Accordingly, the nucleic acid pool
obtained by shuffling a gene according to the present invention
can cover an extremely large number of nucleic acid molecules
having different base sequences, each of which is different from
the base sequence of the original gene.
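The two counts can be reproduced by enumerating the block orders. The following minimal sketch uses the block names of the text; the variable names are illustrative.

```python
from itertools import permutations, product

blocks = ["a1", "a2", "a3", "a4", "a5"]

# Simple re-sequences: each block used exactly once (n! orders).
simple = list(permutations(blocks))

# Complex re-sequences: n positions, any block at each position (n^n orders).
with_repeats = list(product(blocks, repeat=len(blocks)))

print(len(simple))        # 120
print(len(with_repeats))  # 3125

# a1-a3-a2-a1-a5 repeats a1 and omits a4, so it is only a complex re-sequence.
example = ("a1", "a3", "a2", "a1", "a5")
print(example in with_repeats, example in simple)  # True False
```

Both counts include the original order a1-a2-a3-a4-a5 itself.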

The kind of the gene to be shuffled is not specifically defined.
Employable herein is any gene that is composed of polynucleotide
chains and contains a coding region necessary for expressing a
protein or RNA. The nucleotide units may be deoxyribonucleotides
or ribonucleotides. For the purpose of finding useful base
sequences, preferred are genes coding for proteins, especially
enzymes, or control genes for enzymatic functions. Examples of
such enzymes include proteases, lipases, cellulases, amylases,
catalases, xylanases, oxidases, dehydrogenases, oxygenases and
reductases.

The kind of the gene to which the present invention is directed
is not specifically defined, provided that, when it is
introduced into a suitable host, the host can produce the
genetic product through expression of the gene. Examples include
genes cloned from living organisms, artificially synthesized
genes, and genes cloned from living organisms and then
artificially mutated. As genes derived from living organisms,
those of prokaryotes with definite enzyme producibility are
employable. Examples of such prokaryotes include Bacillus
bacteria. One example of the genes derived from such bacteria is
the protease API21 gene derived from Bacillus NKS-21 (FERM
BP-93-1) (Japanese Patent Application Laid-Open No. 5-91876,
Sequence Number 1).

Method for Producing Nucleic Acid Pool 

The present invention also provides a method for producing a
nucleic acid pool comprising two or more different nucleic acids
each having a base sequence that is different from the base
sequence of the original, non-divided gene, which comprises
dividing all or a part of one or more genes into three or more
oligonucleotide blocks, followed by ligating all or a part of
these blocks into sequences that are different from the sequence
or sequences of the original, non-divided gene or genes.

The division of a gene into blocks can be conducted by any
desired method that satisfies the above-mentioned conditions
necessary for the nucleic acid pool. For example, the division
of a gene can be conducted by the use of restriction enzymes.
For this, any desired restriction enzymes can be used,
including, for example, EcoRI, HindIII, BamHI, PstI, KpnI, XbaI,
SmaI, SacI, ClaI, AluI, HaeIII and RsaI. If a gene having a
known base sequence is to be shuffled, each block of the gene
can be obtained through synthesis in accordance with the
above-mentioned conditions. To re-sequence these blocks, they
are blended and ligated, for example, using a ligase.
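Division with a restriction enzyme can be sketched in silico as cutting a sequence at each recognition site. The example below is a simplified, single-strand illustration using the EcoRI recognition site GAATTC, cut after the first G; the function name and the toy sequence are hypothetical, and a real digest yields double-stranded fragments with overhangs.

```python
def digest(sequence, site="GAATTC", cut_offset=1):
    """Cut one strand (written 5'->3') at every occurrence of `site`.

    `cut_offset` is the number of bases of the site left on the upstream
    fragment (1 for EcoRI, which cuts G^AATTC).
    """
    fragments = []
    start = 0
    pos = sequence.find(site)
    while pos != -1:
        cut = pos + cut_offset
        fragments.append(sequence[start:cut])
        start = cut
        pos = sequence.find(site, pos + 1)
    fragments.append(sequence[start:])
    return fragments


toy_gene = "ATGCGAATTCTTAGGAATTCCGTA"  # hypothetical sequence with two sites
print(digest(toy_gene))  # ['ATGCG', 'AATTCTTAGG', 'AATTCCGTA']
```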

Now, preferred methods for producing a single-stranded nucleic
acid pool and a double-stranded nucleic acid pool are described
in detail hereinunder.

(1) Method for Producing Single-Stranded Nucleic Acid Pool: 

One preferred method of producing a single-stranded nucleic acid
pool of the present invention is to ligate plural blocks, each
with a ribonucleotide at the 3'-terminal, using an RNA ligase.
This method comprises the following steps a) to c):

a) a step of preparing 3 or more blocks of single-stranded
oligonucleotides having base sequences that correspond to all or
a part of one or more genes, through division of one or more
genes or through synthesis of oligonucleotide chains having said
base sequences;

b) a step of adding a ribonucleotide to the 3'-terminal of each
of the oligonucleotide chain blocks obtained in the previous
step a) that has no ribonucleotide at its 3'-terminal, while
adding a phosphoryl group to the 5'-terminal of each of said
blocks that has no phosphoryl group at its 5'-terminal; and

c) a step of ligating in any desired sequence the
oligonucleotide chain blocks obtained in the previous step b),
by reacting the 3'-terminal ribonucleotide of one block with the
5'-terminal phosphoryl group of another block.

Fig. 1 schematically illustrates the above-mentioned steps for
shuffling one gene. In the embodiment illustrated, one gene is
divided into four blocks (a1, a2, a3, a4). To simplify the
explanation of these steps, the nucleic acids are represented
only by their base sequences. A, G, C and T are nucleotide units
comprising the corresponding bases. rG means GMP; and (P) and
(OH) mean the phosphoryl group and the hydroxyl group,
respectively, existing at the terminals of each nucleotide
chain.

Step a)

In the step a), the division of the gene can be effected using
restriction enzymes. After the division, the divided blocks are
denatured, for example under heat or with an alkali, into
single-stranded oligonucleotides. Where the sequence of the gene
is known, single-stranded oligonucleotides can instead be
synthesized using ordinary devices and according to ordinary
methods.

Step b)

Where the blocks obtained in the step a) each have a 3'-terminal
deoxyribonucleotide, a ribonucleotide is added to the
3'-terminal (step b). This addition can be effected by allowing
a terminal deoxynucleotidyl transferase to act on each said
block in the presence of a nucleoside triphosphate (ATP, GTP,
CTP, UTP). The ribonucleotide thus added (AMP, GMP, CMP, UMP)
contains the base corresponding to the nucleoside triphosphate
used (A, G, C, U). Accordingly, by selecting the nucleoside
triphosphate to be used, each block may be given a desired
3'-terminal ribonucleotide. In the embodiment illustrated in
Fig. 1, GMP (represented by the underlined rG in Fig. 1) is
added to the block mixture. However, if the blocks are obtained
separately, for example by synthesizing them separately,
different ribonucleotides can be added to them. The nucleoside
triphosphate is used in an amount of from 2 to 10 times or so,
by mol, relative to each block. The reaction temperature may be
from 30 to 40°C or so, and the reaction time may be from 30
minutes to 2 hours or so.

In the step b), a phosphoryl group is also added to the
5'-terminal of each block. This addition can be effected using a
polynucleotide kinase in the presence of ATP. ATP is used in an
amount of from 2 to 10 times or so, by mol, relative to each
block. The reaction temperature may be from 30 to 40°C or so,
and the reaction time may be from 10 minutes to 1 hour or so.
The pH is most suitably from 7 to 9 or so.

Step c) 

The ligation of oligonucleotide chain blocks in the step c) can
be effected by allowing an RNA ligase to act on the mixture of
blocks obtained in the previous step, in the presence of ATP and
divalent metal ions (Japanese Patent Application Laid-Open No.
5-292967). Useful divalent ions are magnesium ions and manganese
ions, of which magnesium ions are preferred. RNA ligase is an
enzyme that catalyzes the ligation of a 5'-phosphoryl terminated
polynucleotide and a 3'-hydroxyl terminated polynucleotide. The
natural substrate for such an RNA ligase is RNA, but the enzyme
can also effectively catalyze the ligation of a 5'-phosphoryl
terminated polydeoxyribonucleotide and a polydeoxyribonucleotide
having a ribonucleotide only at its 3'-terminal. Preferably used
herein is T4 RNA ligase. The reaction is conducted generally in
a buffer, at a pH of from 7 to 9 and at a temperature of from 10
to 40°C, over a period of from 30 to 180 minutes. For example,
the oligonucleotides may be reacted in a solution comprising 50
mM Tris-HCl (pH 8.0), 10 mM MgCl2, 0.1 mM ATP, 10 mg/liter BSA,
1 mM hexaammine cobalt chloride (HCC) and 25 % polyethylene
glycol 6000, at 25°C for 60 minutes or longer.
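As a worked example of the buffer composition above, the volume of each component for a given reaction volume follows from the dilution relation (stock concentration x stock volume = final concentration x total volume). In the sketch below the final concentrations are those of the text, whereas the stock concentrations and the 20 microliter reaction volume are assumptions chosen only for illustration.

```python
# Final concentrations taken from the example buffer in the text.
FINAL = {
    "Tris-HCl pH 8.0 (mM)": 50,
    "MgCl2 (mM)": 10,
    "ATP (mM)": 0.1,
    "BSA (mg/liter)": 10,
    "HCC (mM)": 1,
    "PEG 6000 (%)": 25,
}

# Hypothetical stock concentrations, in the same units as FINAL.
STOCK = {
    "Tris-HCl pH 8.0 (mM)": 1000,
    "MgCl2 (mM)": 1000,
    "ATP (mM)": 10,
    "BSA (mg/liter)": 1000,
    "HCC (mM)": 100,
    "PEG 6000 (%)": 50,
}


def component_volumes(total_ul):
    """Volume of each stock (microliters) for a reaction of `total_ul`."""
    return {name: total_ul * FINAL[name] / STOCK[name] for name in FINAL}


volumes = component_volumes(20)
for name, vol in volumes.items():
    print(f"{name}: {vol:.2f} ul")
# Remaining volume is available for oligonucleotides, enzyme and water.
print(f"remaining: {20 - sum(volumes.values()):.2f} ul")
```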

Controlling of Reading Frame 

If the lengths of the blocks prepared in the step a) are not
integral multiples of the codon unit, or if nucleotides are
inserted into or deleted from the blocks in the step b), the
amino acid sequences encoded by the blocks vary depending on the
positions of the blocks after shuffling. In many cases, however,
it is desirable that the shuffling does not change the amino
acid sequence encoded by each block, as mentioned hereinabove.
For this purpose, a modified method as schematically illustrated
in Fig. 2 is effective.

In the modified method illustrated, blocks each having a base
sequence of from the (3p+1)th to the (3q+2)th base, as counted
from the starting point of the reading frame on a gene (where p
and q are integers to be independently determined for each
block, provided that p ≤ q), are prepared in the first step
(step a'), and a ribonucleotide that corresponds to the (3q+3)th
base, as counted in the same manner as above, or a
ribonucleotide corresponding to a base that does not change the
amino acid to be encoded at said site, is added to each said
block at its 3'-terminal in the step b').

In the embodiment illustrated in Fig. 2, blocks a1 (p = 0; q =
2), a2 (p = 3; q = 5), a3 (p = 6; q = 9) and a4 are prepared
from the gene to be shuffled in the step a') (if desired, these
may be divided and isolated). To the block a4, a ribonucleotide
is not added at its 3'-terminal, for the reasons mentioned
below.

Next, in the step b'), GMP (represented by the underlined rG in
Fig. 2) is added to a1 and a2, while AMP (represented by the
underlined rA in Fig. 2) is added to a3. The addition of such
ribonucleotides can be effected in the same manner as in the
above-mentioned step b), using a nucleoside triphosphate and a
terminal deoxynucleotidyl transferase (TDT). Alternatively, a
method of directly preparing ribonucleotide-terminated blocks is
employable. As a result of this step, the amino acid sequences
encoded by these blocks are the same as those on the original
gene.
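The coordinate rule for the blocks can be illustrated as follows. This is a sketch under stated assumptions: the function name and the 30-base toy gene are hypothetical, and only the (p, q) values come from the Fig. 2 embodiment. Each block spans bases (3p+1) to (3q+2), and the ribonucleotide added in step b') restores base (3q+3), so that every extended block is a whole number of codons.

```python
def frame_preserving_block(gene, p, q):
    """Return (block, third_base) for 1-based bases 3p+1..3q+2 and 3q+3."""
    if not 0 <= p <= q:
        raise ValueError("p and q must satisfy 0 <= p <= q")
    start = 3 * p        # 0-based index of base (3p+1)
    end = 3 * q + 2      # 0-based index of base (3q+3); exclusive block end
    if end >= len(gene):
        raise ValueError("block extends beyond the gene")
    return gene[start:end], gene[end]


def ribonucleotide_for(base):
    """Name of the 3'-terminal ribonucleotide matching a DNA base."""
    return "r" + ("U" if base == "T" else base)


# Hypothetical 10-codon reading frame (30 bases).
toy_gene = "ATGGCTAAGGGTCCGGAGCTGTTCGACACA"

for name, p, q in (("a1", 0, 2), ("a2", 3, 5), ("a3", 6, 9)):
    block, third = frame_preserving_block(toy_gene, p, q)
    # len(block) + 1 is a multiple of 3, so the reading frame is kept.
    print(name, block, ribonucleotide_for(third), (len(block) + 1) % 3)
```

With this toy sequence, a1 and a2 receive rG and a3 receives rA, mirroring the pattern described for Fig. 2.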

After this, the blocks are phosphorylated with a polynucleotide
kinase (PNK) in the same manner as in the above-mentioned step
b) (refer to the latter half of the step b)). Next, in the step
c'), these blocks are re-sequenced and ligated in the same
manner as in the above-mentioned step c).

Specific Determination of Terminal Sequence

As in Fig. 2, some blocks (a4 in this embodiment) may be left
without a ribonucleotide at the 3'-terminal. When blocks left
unprocessed in this way are treated with an RNA ligase, no other
block can be ligated to them at the 3'-terminal. As a result,
substantially all the nucleic acids in the nucleic acid pool
obtained have one of these blocks at the 3'-terminal. In the
same manner, if some particular blocks are not phosphorylated at
the 5'-terminal, substantially all the nucleic acids in the
nucleic acid pool obtained have one of such specific blocks at
the 5'-terminal.

Employing this method, it is possible with ease to prepare 
a nucleic acid pool comprising nucleic acids which have 
predetermined particular blocks positioned at the terminals 
while having random re-sequences in the intermediate part. 

20 This method is especially advantageous in producing protein 
mutants having particular amino acid sequences at the terminals 
or for the purpose of expressing particular control functions. 
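
The terminal-fixing rule described above can be illustrated with a minimal Python sketch. It models each block only by two flags (whether it carries a 5'-phosphoryl group and whether it carries a 3'-ribonucleotide) and enumerates the block orders in which every junction joins a 3'-ribonucleotide to a 5'-phosphoryl group and neither end can be extended further. The block names ("head", "x", "y", "z", "tail"), the `Block` class and the fixed chain length of five are illustrative assumptions, not part of the disclosed method.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Block:
    name: str
    has_5p: bool  # 5'-phosphoryl group present: another block can be ligated upstream
    has_3r: bool  # 3'-ribonucleotide present: another block can be ligated downstream


def can_join(upstream: Block, downstream: Block) -> bool:
    """A junction forms only between a 3'-ribonucleotide and a 5'-phosphoryl group."""
    return upstream.has_3r and downstream.has_5p


def closed_chains(blocks, length):
    """Enumerate chains of the given length whose two ends cannot be extended."""
    chains = []
    for combo in product(blocks, repeat=length):
        joined = all(can_join(a, b) for a, b in zip(combo, combo[1:]))
        capped = (not combo[0].has_5p) and (not combo[-1].has_3r)
        if joined and capped:
            chains.append("-".join(b.name for b in combo))
    return chains


blocks = [
    Block("head", has_5p=False, has_3r=True),   # hypothetical block fixed at the 5'-terminal
    Block("x", has_5p=True, has_3r=True),
    Block("y", has_5p=True, has_3r=True),
    Block("z", has_5p=True, has_3r=True),
    Block("tail", has_5p=True, has_3r=False),   # hypothetical block fixed at the 3'-terminal
]

pool = closed_chains(blocks, 5)
print(len(pool))   # number of distinct block orders of length five
print(pool[:3])
```

In this toy model every chain of five blocks starts with the unphosphorylated block and ends with the block lacking a 3'-ribonucleotide, while the three intermediate positions take every combination of the remaining blocks.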

(2) Method for Producing Double-Stranded Nucleic Acid Pool

The molecules obtained by the process described in the previous (1) are single-stranded, and can be converted into double-stranded ones through the genetic treatment described below. For this, the block mixture is made to contain a block having a 5'-phosphoryl group but not having a 3'-ribonucleotide group (this block is referred to as oligonucleotide A; in Fig. 2, a4 corresponds to this block). Accordingly, substantially all the nucleic acids that constitute the nucleic acid pool to be produced are terminated by oligonucleotide A at the 3'-terminal, as described at the end of the previous (1). Next, the nucleic acid blocks are subjected to ordinary DNA-extending reaction, using, as a primer, a decamer (10-mer) or longer oligonucleotide, preferably a heptadecamer (17-mer) or longer oligonucleotide, that is complementary to oligonucleotide A. For this, any enzyme that catalyzes the DNA-extending reaction can be employed, such as Taq polymerase, Klenow fragment, DNA polymerase I or the like.

For this purpose, PCR (polymerase chain reaction) can also be employed. If PCR is employed, an additional oligonucleotide having a 3'-ribonucleotide group but not having a 5'-phosphoryl group (hereinafter referred to as oligonucleotide B) is added to the block mixture, in addition to the above-mentioned oligonucleotide A, during the process of preparing the pool. Accordingly, all the molecules that constitute the pool have oligonucleotide A at the 3'-terminal and oligonucleotide B at the 5'-terminal.

After this, the nucleic acid blocks are subjected to PCR, using, as primers, a 10-mer or longer oligonucleotide, preferably a 17-mer or longer oligonucleotide, that is complementary to oligonucleotide A, and a 10-mer or longer oligonucleotide, preferably a 17-mer or longer oligonucleotide, that is complementary to oligonucleotide B, whereby the nucleic acid blocks are converted into double-stranded ones while being amplified at the same time. Therefore, this process is advantageous for the subsequent operations.

The oligonucleotides A and/or B may be the same as sequences existing on the original gene, or, if desired, may be others which the original gene does not have.

Expression of Genetic Information in Nucleic Acid Pool

The resulting double-stranded nucleic acid is blunted, and then ligated to any desired vector, preferably an expression vector, such as pKK223-3, using a DNA ligase. If desired, the oligonucleotides A and B positioned at both terminals of the nucleic acid may be made to have suitable restriction enzyme recognition sites. In this case, the nucleic acid may be ligated to a suitable vector, using the corresponding restriction enzymes.

Next, the vector library thus produced is introduced into a suitable host, in which the genetic information is expressed. Thus, the intended genetic product with favorable properties and also the gene coding for it can be obtained. Any ordinary host can be used herein. Preferred examples of the host include cells of E. coli, bacillus bacteria, yeasts, and lactic acid bacteria.

If desired, in vitro transcription systems and translation systems are also employable herein. In those cases, the genetic information can be expressed even when the nucleic acid is not ligated to a vector.

The "genetic information" as referred to herein indicates the information on a gene which is carried by a DNA or RNA and which is translated into a protein or is transcribed into RNA in a suitable living body by the DNA or RNA by itself or after having been ligated to any other DNA or RNA.

The genetic information that is expected to be expressed according to the method of the present invention is not specifically defined, but includes, for example, that on various genetic products, such as enzymes, antibodies, hormones, receptor proteins and ribozymes, and that on various control functions of, for example, operators, promoters and attenuators.

Examples

Now, the present invention is described more concretely hereinunder with reference to the following examples, which, however, are not intended to restrict the scope of the present invention.

Example 1: Production of Single-Stranded Nucleic Acid Pool

A nucleic acid pool was produced in accordance with the process mentioned below, based on the gene of the wild-type alkali protease API21 (Japanese Patent Application Laid-Open No. 5-91876) cloned from Bacillus NKS-21 (FERM BP-93-1), which has the sequence of Sequence Number 1.

(1) Step a): Preparation of Oligonucleotide Blocks

Using an automatic DNA synthesizer, Model 392 (manufactured by Perkin Elmer Co.), the following 5 oligonucleotides were synthesized.

1) Oligo A (Sequence Number 2; this corresponds to the base sequence of from 436th to 455th in Sequence Number 1)
2) Oligo 1 (Sequence Number 3; from 457th to 503rd in Sequence Number 1)
3) Oligo 2 (Sequence Number 4; from 505th to 551st in Sequence Number 1)
4) Oligo 3 (Sequence Number 5; from 553rd to 596th in Sequence Number 1)
5) Oligo B (Sequence Number 6; from 598th to 618th in Sequence Number 1)

Oligo A, oligo 1 to 3, and oligo B are parts of the protease API21 gene. Their positions are as mentioned above. These oligonucleotides were synthesized in a DM trityl-on condition (that is, while the 5'-hydroxyl group was protected with a dimethoxytrityl group), and purified through an OPC column. The reagents used herein were obtained from Perkin Elmer Co.
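
The positions listed above can be checked against the frame rule used for shuffling without altering codons, namely that a block spans the (3p+1)th to the (3q+2)th base of the reading frame, leaving the (3q+3)th base to be supplied by the added ribonucleotide. The following is a minimal Python sketch of that check. It assumes that the reading frame starts at base 1 of Sequence Number 1; the dictionary and function names are illustrative only.

```python
# Positions (1-based, inclusive) in Sequence Number 1, as listed in the text.
OLIGOS = {
    "A": (436, 455),
    "1": (457, 503),
    "2": (505, 551),
    "3": (553, 596),
    "B": (598, 618),
}

# Lengths given in the Sequence Listing (Sequence Numbers 2 to 6).
LISTED_LENGTHS = {"A": 20, "1": 47, "2": 47, "3": 44, "B": 21}


def frame_params(start, end):
    """Return (p, q) if the block spans bases 3p+1 .. 3q+2, otherwise None."""
    if start % 3 == 1 and end % 3 == 2:
        return (start - 1) // 3, (end - 2) // 3
    return None


for name, (start, end) in OLIGOS.items():
    length = end - start + 1
    # Print the block name, its length, whether it matches the listing, and (p, q).
    print(name, length, length == LISTED_LENGTHS[name], frame_params(start, end))
```

Under that assumption, oligo A and oligos 1 to 3 each end one base short of a codon boundary, which is the position filled by the added ribonucleotide, while oligo B, which receives no ribonucleotide, ends on a codon boundary.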

(2) Step b): Processing of Oligonucleotide Blocks

(2-1) Addition of Ribonucleotide:

500 pmols of oligo A, 1 nmol of UTP and 10 units of terminal deoxynucleotidyl transferase were added to a standard solution comprising:

50 mM Tris-HCl buffer (pH 8.0)
10 mM MgCl2
5 mM DTT (dithiothreitol)
25 % PEG 6000
1 mM HCC (hexaammine cobalt chloride)
10 µg/ml BSA (bovine serum albumin),

to thereby make 10 µl in total. The resulting solution was left at 37°C for 1 hour.

Oligo 1 and oligo 2 were processed in the same manner as above. Oligo 3 was processed in the same manner as above, but using ATP in place of UTP. The oligonucleotide blocks to which a 3'-terminal ribonucleotide had been added through the above-mentioned operation are referred to as oligo Ar, oligo 1r, oligo 2r and oligo 3r.
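
The choice of UTP for oligo A, oligo 1 and oligo 2 and of ATP for oligo 3 can be checked with a short Python sketch: joining the blocks in their original order, with each added ribonucleotide read as the corresponding DNA base (U read as T), should reproduce bases 436 to 618 of Sequence Number 1. The wild-type segment below was transcribed from Sequence Number 1 and the oligonucleotide sequences from Sequence Numbers 2 to 6; the function and variable names are illustrative assumptions.

```python
# Oligonucleotide sequences from Sequence Numbers 2 to 6.
OLIGOS = {
    "A": "GGAGGAGTAAGCTTTATCTC",
    "1": "ACAGAAAACACTTATGTGGATTATAATGGTCACGGTACTCACGTAGC",
    "2": "GGTACTGTAGCTGCCCTAAACAATTCATATGGCGTATTGGGAGTGGC",
    "3": "CCTGGAGCTGAACTATATGCTGTTAAAGTTCTTGATCGTAACGG",
    "B": "AGCGGTTCGCATGCATCCATT",
}

# Ribonucleotide added to the 3'-terminal of each block in step (2-1).
ADDED = {"A": "U", "1": "U", "2": "U", "3": "A"}

# Bases 436 to 618 of Sequence Number 1, written codon by codon.
WILD_TYPE_436_618 = (
    "GGA GGA GTA AGC TTT ATC TCT ACA GAA AAC ACT TAT GTG GAT TAT "
    "AAT GGT CAC GGT ACT CAC GTA GCT GGT ACT GTA GCT GCC CTA AAC AAT "
    "TCA TAT GGC GTA TTG GGA GTG GCT CCT GGA GCT GAA CTA TAT GCT GTT "
    "AAA GTT CTT GAT CGT AAC GGA AGC GGT TCG CAT GCA TCC ATT"
).replace(" ", "")


def assemble(order):
    """Join blocks in the given order, reading each added ribonucleotide as DNA."""
    parts = []
    for name in order:
        parts.append(OLIGOS[name])
        parts.append(ADDED.get(name, "").replace("U", "T"))
    return "".join(parts)


wild_type_order = ["A", "1", "2", "3", "B"]
print(assemble(wild_type_order) == WILD_TYPE_436_618)
```

Because each added base matches the base that was left out between adjacent blocks, any re-ordering of oligos 1 to 3 keeps every block in the original reading frame.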

(2-2) Phosphorylation:

500 pmols of oligo 1r, 1 nmol of ATP and 10 units of polynucleotide kinase were dissolved in the standard solution having the same composition as above to make 10 µl in total. The resulting solution was left at 37°C for 1 hour. Oligo 2r, oligo 3r and oligo B were processed in the same manner as above. The polynucleotides thus formed are referred to as oligo 1pr, oligo 2pr, oligo 3pr and oligo Bp.

(3) Step c): Ligation of Oligonucleotide Blocks:

500 pmols of oligo Ar, 500 pmols of oligo 1pr, 500 pmols of oligo 2pr, 500 pmols of oligo 3pr and 500 pmols of oligo Bp, which had been prepared in the previous step, and also 1 nmol of ATP and 50 units of T4 RNA ligase were dissolved in the standard solution having the same composition as above to make 10 µl in total. These were reacted at 25°C for 4 hours.

Subsequently, the reaction mixture was subjected to polyacrylamide gel electrophoresis, through which fragments of about 180 bp were collected from the gel.

As a result, a single-stranded nucleic acid pool was obtained in which oligo 1pr, oligo 2pr and oligo 3pr were ligated in random sequences between oligo Ar and oligo Bp.
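
As a rough consistency check on the size collected from the gel, the expected length of each ligation product can be computed from the block lengths in the Sequence Listing plus one added ribonucleotide for every block except oligo Bp. The Python sketch below assumes exactly three internal blocks between oligo Ar and oligo Bp; the names used are illustrative.

```python
from itertools import product

# DNA lengths of the blocks (Sequence Numbers 2 to 6).
DNA_LEN = {"A": 20, "1": 47, "2": 47, "3": 44, "B": 21}


def product_length(order):
    """Length of Ar + three internal blocks + Bp.

    Every block except Bp carries one extra 3'-terminal ribonucleotide.
    """
    internal = sum(DNA_LEN[b] + 1 for b in order)
    return (DNA_LEN["A"] + 1) + internal + DNA_LEN["B"]


lengths = {"".join(o): product_length(o) for o in product("123", repeat=3)}

print(len(lengths))                                  # number of block orders
print(min(lengths.values()), max(lengths.values()))  # shortest and longest product
print(lengths["123"])                                # wild-type order
```

Under this assumption all 27 orders fall between 177 and 186 bases, which is in line with the fragments of about 180 bp collected from the gel.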

Example 2: Production of Double-Stranded Nucleic Acid Pool

Oligo B' (its sequence is represented by Sequence Number 7), which is complementary to oligo B, was synthesized in the same manner as in Example 1.
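
The relationship between oligo B and oligo B' can be confirmed with a minimal Python sketch that computes the reverse complement of Sequence Number 6 and compares it with Sequence Number 7. The helper name `reverse_complement` is an illustrative choice.

```python
# Complement table for the four DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


OLIGO_B = "AGCGGTTCGCATGCATCCATT"        # Sequence Number 6
OLIGO_B_PRIME = "AATGGATGCATGCGAACCGCT"  # Sequence Number 7

print(reverse_complement(OLIGO_B) == OLIGO_B_PRIME)
```

Since oligo Bp lies at the 3'-terminal of every molecule in the pool, a primer with this sequence can anneal there and be extended across the shuffled region.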

The DNAs constituting the single-stranded nucleic acid pool obtained in Example 1 were, as a whole (that is, without being separated into the individual DNAs), mixed with 10 pmols of oligo B', and added to Tris-HCl buffer containing MgCl2 and DTT (dithiothreitol) to make 20 µl in total. The resulting mixture finally comprised 10 mM Tris-HCl buffer (pH 7.5), 7 mM MgCl2 and 0.1 mM DTT. This was heated at 75°C for 5 minutes, and then cooled to 30°C. Next, 1 unit of Klenow fragment was added thereto, and the mixture was kept at 37°C for 2 hours, whereby the single-stranded nucleic acids were converted into double-stranded ones.

As a result, a double-stranded nucleic acid pool was obtained in which oligo 1pr, oligo 2pr and oligo 3pr were ligated in random sequences between oligo Ar and oligo Bp.

Example 3: Transformation of E. coli with Nucleic Acid Pool

A plasmid pSDT812 (Japanese Patent Application Laid-Open No. 1-141596), which had been prepared by inserting a gene of a wild-type alkali protease cloned from Bacillus NKS-21 into a plasmid pHSG396 at its ClaI-cleaving site, was digested with a restriction enzyme ClaI, and then blunted with a commercially-available blunting reagent (Blunting Kit, manufactured by Takara Shuzo Co.). This was mixed with a plasmid pHY300PLK (manufactured by Yakult Honsha Co.), which had been digested with restriction enzymes EcoRI and HindIII, then blunted in the same manner as above and processed with an alkaline phosphatase, and these were ligated using a commercially-available ligation kit (manufactured by Takara Shuzo Co.).

Using the resulting DNA, cells of E. coli JM105 were transformed, and tetracycline-resistant transformants were selected. From these transformants, the plasmid DNA was extracted, purified and analyzed. Thus was obtained a plasmid pHY812 (Fig. 3), in which pHY300PLK was ligated to the wild-type alkali protease gene.

Next, pHY812 formed in the above was digested with restriction enzymes HindIII and SphI, and then processed with BAP (a). On the other hand, the DNAs constituting the double-stranded nucleic acid pool formed in Example 2 were digested all at once with restriction enzymes HindIII and SphI (b). (a) and (b) were ligated in the same manner as above, using the ligation kit. With the resulting DNA, cells of E. coli JM105 were transformed. The resulting transformants were screened on an L-plate containing ampicillin.

Example 4

From the colonies that had been selected in Example 3, plasmid DNA was prepared, which was then digested with HindIII and SphI to check whether or not it gave a fragment of about 160 bp after the digestion. The base sequences of 95 clones that had given the fragment having the intended length were analyzed, which verified the shuffling of the gene blocks corresponding to oligo 1, oligo 2 and oligo 3. Table 1 shows the different types of shuffling, and the number of clones of each type.
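
The fragment size used for this check can be estimated with a short Python sketch. It locates the HindIII site (AAGCTT, cut as A^AGCTT) in oligo A and the SphI site (GCATGC, cut as GCATG^C) in oligo B, and counts the bases of the upper strand between the two cuts. The sketch assumes three internal blocks, one junction base after oligo A and after each internal block, and counts only the upper strand; the names are illustrative.

```python
OLIGO_A = "GGAGGAGTAAGCTTTATCTC"   # Sequence Number 2
OLIGO_B = "AGCGGTTCGCATGCATCCATT"  # Sequence Number 6
INTERNAL_LEN = {"1": 47, "2": 47, "3": 44}

HINDIII_SITE, HINDIII_CUT = "AAGCTT", 1  # A^AGCTT: cut after the first base of the site
SPHI_SITE, SPHI_CUT = "GCATGC", 5        # GCATG^C: cut after the fifth base of the site


def fragment_length(order):
    """Upper-strand length between the HindIII cut in oligo A and the SphI cut in oligo B."""
    a_cut = OLIGO_A.find(HINDIII_SITE) + HINDIII_CUT
    b_cut = OLIGO_B.find(SPHI_SITE) + SPHI_CUT
    left = len(OLIGO_A) - a_cut + 1                   # rest of oligo A plus its junction base
    middle = sum(INTERNAL_LEN[b] + 1 for b in order)  # internal blocks plus junction bases
    return left + middle + b_cut


print(fragment_length("123"))                          # wild-type order
print(fragment_length("333"), fragment_length("111"))  # shortest and longest cases
```

Under these assumptions every shuffled insert gives a fragment of between 160 and 169 bases, consistent with the fragment of about 160 bp used as the selection criterion.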
Table 1

Type of Shuffling   Number of Clones   Type of Shuffling   Number of Clones
111                 3                  223                 (illegible)
112                 4                  231                 (illegible)
113                 2                  232                 (illegible)
121                 5                  233                 3
122                 3                  311                 (illegible)
123                 (illegible)        312                 (illegible)
131                 2                  313                 3
132                 7                  321                 7
133                 3                  322                 3
211                 2                  323                 4
212                 3                  331                 2
213                 5                  332                 5
221                 2                  333                 1
222                 1


As in the above, it has been confirmed that, if three blocks of one gene are shuffled according to the method of the present invention, all combinations of clones, each containing three of these blocks (the same or different), are obtained.

Example 5

The plasmid DNAs produced in Example 4 were mixed. Using the resulting DNA mixture, cells of Bacillus subtilis UOT0999 were transformed. Tetracycline-resistant transformants were selected. 300 transformants were replicated on a skim milk-containing medium plate, on which clear zones were found around the colonies of 15 transformants. Accordingly, it is understood that the enzyme which the shuffled gene codes for can be selected depending on its activity. Plasmid DNA was prepared from these 15 transformants that had formed the clear zones, and then sequenced. From the base sequence thus identified, it is understood that the blocks of the plasmid DNA prepared herein were arranged in the same order as in the wild-type plasmid DNA.

Example 6

From 10 transformants (one forms clear zones, while nine do not) obtained in Example 5, and also from the host, Bacillus subtilis UOT0999, which does not have the plasmid, full-length RNAs were prepared. These were processed with a ribonuclease-free deoxyribonuclease, in order to remove the influence of the plasmid on the hybridization to be effected later on. Next, using oligo B' as the probe, these were subjected to Northern hybridization. As a result, all lanes corresponding to the RNA of the transformants gave detectable bands, but no band was detected on the lanes corresponding to the RNA of the host.

[Advantages of the Invention]

According to the present invention, it is possible to obtain, through simple processes, a nucleic acid pool capable of covering various base sequences which are substantially apart from the naturally-existing base sequence spaces. Therefore, it is possible to obtain excellent genetic products, such as proteins and enzymes, which could not be obtained by conventional methods and which have not been examined by organisms in the past. In addition, according to the method of the present invention for producing a nucleic acid pool, it is possible to obtain a mixture of nucleic acids while optionally shuffling the constitutive blocks at random in the intermediate parts but fixing the terminal sequences to predetermined, desired ones, and it is also possible to shuffle the constitutive blocks without changing the amino acid sequence which each block codes for. Therefore, as compared with a method of producing a completely-randomized nucleic acid pool, there is a high possibility that useful genetic products can be produced according to the method of the present invention.

Sequence Listing

Sequence Number: 1
Length of Sequence: 1122
Type of Sequence: Nucleic Acid
Number of Strands: Double-stranded
Topology: Linear
Kind of Sequence: Genomic DNA
Source: Bacillus NKS-21 (FERM BP-93-1)
Characteristics of Sequence:
Code Indicating Characteristics: Sig Peptide
Existing Site: 1 ... 93
Method of Determining Characteristics: S
Code Indicating Characteristics: Mat Peptide
Existing Site: 104 ... 1112
Method of Determining Characteristics: S
Sequence:

ATG AAT CTT CAA AAA ATA GCC TCA GCG TTG AAG GTT AAG CAA TCG GCA    48
Met Asn Leu Gln Lys Ile Ala Ser Ala Leu Lys Val Lys Gln Ser Ala
        -100                -95                 -90

TTG GTC AGC AGT TTA ACT ATT TTG TTT CTA ATC ATG CTA GTA GGT ACG    96
Leu Val Ser Ser Leu Thr Ile Leu Phe Leu Ile Met Leu Val Gly Thr
    -85                 -80                 -75

ACT AGT GCA AAT GGT GCG AAG CAA GAG TAC TTA ATT GGT TTC AAC TCA   144
Thr Ser Ala Asn Gly Ala Lys Gln Glu Tyr Leu Ile Gly Phe Asn Ser
-70                 -65                 -60                 -55

GAC AAG GCA AAA GGA CTT ATC CAA AAT GCA GGT GGA GAA ATT CAT CAT   192
Asp Lys Ala Lys Gly Leu Ile Gln Asn Ala Gly Gly Glu Ile His His
                -50                 -45                 -40

GAA TAT ACA GAG TTT CCA GTT ATC TAT GCA GAG CTT CCA GAA GCA GCG   240
Glu Tyr Thr Glu Phe Pro Val Ile Tyr Ala Glu Leu Pro Glu Ala Ala
            -35                 -30                 -25

GTA AGT GGA TTG AAA AAT AAT CCT CAT ATT GAT TTT ATT GAG GAA AAC   288
Val Ser Gly Leu Lys Asn Asn Pro His Ile Asp Phe Ile Glu Glu Asn
        -20                 -15                 -10

GAA GAA GTT GAA ATT GCA CAG ACT GTT CCT TGG GGA ATC CCT TAT ATT   336
Glu Glu Val Glu Ile Ala Gln Thr Val Pro Trp Gly Ile Pro Tyr Ile
    -5                  1               5                   10

TAC TCG GAT GTT GTT CAT CGT CAA GGT TAC TTT GGG AAC GGA GTA AAA   384
Tyr Ser Asp Val Val His Arg Gln Gly Tyr Phe Gly Asn Gly Val Lys
                15                  20                  25

GTA GCA GTA CTT GAT ACA GGA GTG GCT CCT CAT CCT GAT TTA CAT ATT   432
Val Ala Val Leu Asp Thr Gly Val Ala Pro His Pro Asp Leu His Ile
            30                  35                  40

AGA GGA GGA GTA AGC TTT ATC TCT ACA GAA AAC ACT TAT GTG GAT TAT   480
Arg Gly Gly Val Ser Phe Ile Ser Thr Glu Asn Thr Tyr Val Asp Tyr
        45                  50                  55

AAT GGT CAC GGT ACT CAC GTA GCT GGT ACT GTA GCT GCC CTA AAC AAT   528
Asn Gly His Gly Thr His Val Ala Gly Thr Val Ala Ala Leu Asn Asn
    60                  65                  70

TCA TAT GGC GTA TTG GGA GTG GCT CCT GGA GCT GAA CTA TAT GCT GTT   576
Ser Tyr Gly Val Leu Gly Val Ala Pro Gly Ala Glu Leu Tyr Ala Val
75                  80                  85                  90

AAA GTT CTT GAT CGT AAC GGA AGC GGT TCG CAT GCA TCC ATT GCT CAA   624
Lys Val Leu Asp Arg Asn Gly Ser Gly Ser His Ala Ser Ile Ala Gln
                95                  100                 105

GGA ATT GAA TGG GCG ATG AAT AAT GGG ATG GAT ATT GCC AAC ATG AGT   672
Gly Ile Glu Trp Ala Met Asn Asn Gly Met Asp Ile Ala Asn Met Ser
            110                 115                 120

TTA GGA AGT CCT TCT GGG TCT ACA ACC CTG CAA TTA GCA GCA GAC CGC   720
Leu Gly Ser Pro Ser Gly Ser Thr Thr Leu Gln Leu Ala Ala Asp Arg
        125                 130                 135

GCT AGG AAT GCA GGT GTC TTA TTA ATT GGG GCG GCT GGA AAC TCA GGA   768
Ala Arg Asn Ala Gly Val Leu Leu Ile Gly Ala Ala Gly Asn Ser Gly
    140                 145                 150

CAA CAA GGC GGC TCG AAT AAC ATG GGC TAC CCA GCG CGC TAT GCA TCT   816
Gln Gln Gly Gly Ser Asn Asn Met Gly Tyr Pro Ala Arg Tyr Ala Ser
155                 160                 165                 170

GTC ATG GCT GTT GGA GCG GTG GAC CAA AAT GGA AAT AGA GCG AAC TTT   864
Val Met Ala Val Gly Ala Val Asp Gln Asn Gly Asn Arg Ala Asn Phe
                175                 180                 185

TCA AGC TAT GGA TCA GAA CTT GAG ATT ATG GCG CCT GGT GTC AAT ATT   912
Ser Ser Tyr Gly Ser Glu Leu Glu Ile Met Ala Pro Gly Val Asn Ile
            190                 195                 200

AAC AGT ACG TAT TTA AAT AAC GGA TAT CGC AGT TTA AAT GGT ACG TCA   960
Asn Ser Thr Tyr Leu Asn Asn Gly Tyr Arg Ser Leu Asn Gly Thr Ser
        205                 210                 215

ATG GCA TCT CCA CAT GTT GCT GGG GTA GCT GCA TTA GTT AAA CAA AAA  1008
Met Ala Ser Pro His Val Ala Gly Val Ala Ala Leu Val Lys Gln Lys
    220                 225                 230

CAC CCT CAC TTA ACG GCG GCA CAA ATT CGT AAT CGT ATG AAT CAA ACA  1056
His Pro His Leu Thr Ala Ala Gln Ile Arg Asn Arg Met Asn Gln Thr
235                 240                 245                 250

GCA ATT CCG CTT GGT AAC AGC ACG TAT TAT GGA AAT GGC TTA GTG GAT  1104
Ala Ile Pro Leu Gly Asn Ser Thr Tyr Tyr Gly Asn Gly Leu Val Asp
                255                 260                 265

GCT GAG TAT GCG GCT CAA  1122
Ala Glu Tyr Ala Ala Gln
            270     272

Sequence Number: 2

Length of Sequence: 20 
Type of Sequence: Nucleic Acid 
Number of Strand: Single-stranded 
Topology: Linear 

Kind of Sequence: Other Nucleic Acid, Synthetic DNA 
Sequence: 

GGAGGAGTAA GCTTTATCTC 

Sequence Number: 3 

Length of Sequence: 47 
Type of Sequence: Nucleic Acid 
Number of Strand: Single-stranded 
Topology: Linear 

Kind of Sequence: Other Nucleic Acid, Synthetic DNA 
Sequence: 

ACAGAAAACA CTTATGTGGA TTATAATGGT CACGGTACTC ACGTAGC 

Sequence Number: 4 

Length of Sequence: 47 
Type of Sequence: Nucleic Acid 
Number of Strand: Single-stranded 
Topology: Linear 

Kind of Sequence: Other Nucleic Acid, Synthetic DNA 
Sequence:

GGTACTGTAG CTGCCCTAAA CAATTCATAT GGCGTATTGG GAGTGGC

Sequence Number: 5

Length of Sequence: 44 
Type of Sequence: Nucleic Acid 
Number of Strand: Single-stranded 
Topology: Linear 

Kind of Sequence: Other Nucleic Acid, Synthetic DNA 
Sequence: 

CCTGGAGCTG AACTATATGC TGTTAAAGTT CTTGATCGTA ACGG 

Sequence Number: 6 

Length of Sequence: 21 
Type of Sequence: Nucleic Acid 
Number of Strand: Single-stranded 
Topology: Linear 

Kind of Sequence: Other Nucleic Acid, Synthetic DNA 
Sequence: 

AGCGGTTCGC ATGCATCCAT T 

Sequence Number: 7

Length of Sequence: 21 
Type of Sequence: Nucleic Acid 
Number of Strand: Single-stranded 
Topology: Linear 

Kind of Sequence: Other Nucleic Acid, Synthetic DNA 
Sequence: 

AATGGATGCA TGCGAACCGC T

Brief Description of the Drawings

Fig. 1 is a schematic view showing one embodiment of the method for producing a nucleic acid pool of the present invention.

Fig. 2 is a schematic view showing another embodiment of the method for producing a nucleic acid pool of the present invention.

Fig. 3 is a restriction enzyme cleavage map of plasmid pHY812, in which the alkali protease gene derived from Bacillus NKS-21 has been ligated to plasmid pHY300PLK.

Claims

1. A nucleic acid pool comprising two or more different nucleic acids, which is constructed by dividing all or a part of one or more genes into 3 or more blocks followed by ligating all or a part of these blocks into sequences that are different from the sequence or sequences of the original, non-divided gene or genes.

2. The nucleic acid pool as claimed in claim 1, containing 10 or more different nucleic acids.

3. The nucleic acid pool as claimed in claim 1 or 2, containing all the nucleic acids with different sequences as constructed by re-sequencing a number, n, of said blocks (where n is the number of the different blocks as formed by the division).

4. The nucleic acid pool as claimed in any one of claims 1 to 3, wherein the gene is a gene coding for a protein, and the amino acid sequence as encoded by each block is the same as the amino acid sequence as encoded by the corresponding part on the original gene.

5. The nucleic acid pool as claimed in any one of claims 1 to 4, wherein the gene is a gene coding for an enzymatic function or a control gene for it.

6. The nucleic acid pool as claimed in claim 5, wherein the gene is a gene coding for any one of proteases, lipases, cellulases, amylases, catalases, xylanases, oxidases, dehydrogenases, oxygenases and reductases.

7. The nucleic acid pool as claimed in any one of claims 1 to 6, wherein the gene is one derived from prokaryotes.

8. The nucleic acid pool as claimed in claim 7, wherein the gene is one derived from bacillus bacteria.

9. The nucleic acid pool as claimed in claim 8, wherein the gene is a protease API21 gene.

10. The nucleic acid pool as claimed in any one of claims 1 to 9, wherein each block is an oligonucleotide.

11. The nucleic acid pool as claimed in claim 10, wherein the nucleic acid is a single-stranded polynucleotide.

12. The nucleic acid pool as claimed in claim 10, wherein the nucleic acid is a double-stranded polynucleotide.

13. A method for producing a nucleic acid pool comprising two or more different nucleic acids, which comprises dividing all or a part of one or more genes into three or more oligonucleotide blocks or synthesizing oligonucleotides corresponding to said blocks, followed by ligating all or a part of these blocks into sequences that are different from those on the genes.

14. The method for producing a nucleic acid pool as claimed in claim 13, which comprises the following steps a) to c):

a) a step of preparing 3 or more blocks of single-stranded oligonucleotides having base sequences that correspond to all or a part of one or more genes, through division of one or more genes or through synthesis of oligonucleotide chains having said base sequences;

b) a step of adding a ribonucleotide to the 3'-terminal of each of the oligonucleotide chain blocks obtained in the previous step a) that has a deoxyribonucleotide at its 3'-terminal, while adding a phosphoryl group to the 5'-terminal of each of said blocks that has a hydroxyl group at its 5'-terminal; and

c) a step of ligating in any desired sequence the oligonucleotide chain blocks as obtained in the previous step b), by reacting the 3'-terminal ribonucleotide of one block with the 5'-terminal phosphoryl group of another block.

15. The method for producing a nucleic acid pool as claimed in claim 14, wherein the number of the blocks to be prepared in the step a) is 3 or more.

16. The method for producing a nucleic acid pool as claimed in claim 14 or 15, wherein at least one block is left to still have its 5'-terminal hydroxyl group in the step b) to thereby selectively obtain a nucleic acid or nucleic acids having said block at the 5'-terminal.

17. The method for producing a nucleic acid pool as claimed in any one of claims 14 to 16, wherein at least one block is left to still have its 3'-terminal deoxyribonucleotide in the step b) to thereby selectively obtain a nucleic acid or nucleic acids having said block at the 3'-terminal.

18. The method for producing a nucleic acid pool as claimed in any one of claims 14 to 17, wherein the blocks are prepared in such a manner that the amino acid sequence as encoded by each block is the same as the amino acid sequence as encoded by the corresponding part on the original gene.

19. The method for producing a nucleic acid pool as claimed in claim 18, wherein blocks each having a base sequence of from the (3p+1)th to the (3q+2)th base, as counted from the starting point of the reading frame on a gene (where p and q are integers to be independently determined for each block, provided that p < q), are prepared in the step a), and a ribonucleotide that corresponds to the (3q+3)th base, as counted in the same manner as above, or corresponds to a base that does not change the amino acid to be encoded at said site, is added to each said block at its 3'-terminal in the step b).

20. The method for producing a nucleic acid pool as claimed in any one of claims 14 to 19, wherein the ligation of the step c) is conducted, using an RNA ligase in the presence of adenosine triphosphate and divalent metal ions.

21. A method for producing a double-stranded nucleic acid pool, which comprises converting the single-stranded nucleic acids as obtained in any one of claims 14 to 20 into double-stranded ones through polymerase reaction.

22. A genetic product to be obtained by expressing the genetic information that exists in the nucleic acid pool as set forth in any one of claims 1 to 12.